ABOUT THE SUBSEQUENCE FILES ON THIS DISK

1.  The transcription factors subsequence file was originally extracted from Ghosh's database of transcription factors.  A text file of the subsequence data was formatted by IBI into MacVector subsequence file format.

The file is very large and can take several minutes to load.  Once MacVector's subsequence analysis is started, it may spend five or more minutes preprocessing the file (making a reverse complement copy of each site) before the actual search for sites begins.
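
For readers unfamiliar with the reverse complement step, the following is a minimal Python sketch of the idea.  It is an illustration only, not MacVector's code: the function names, the (name, sequence) layout, and the sample sites are all assumptions made for this example, and only the unambiguous bases A, C, G, and T are handled.

```python
# Illustrative sketch only; this is not MacVector's implementation.
# Assumes sites use only A, C, G, T (no IUPAC ambiguity codes).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(site: str) -> str:
    """Return the reverse complement of a DNA site."""
    return site.translate(COMPLEMENT)[::-1]


def with_reverse_complements(sites):
    """Expand (name, sequence) pairs so each site appears on both strands."""
    expanded = []
    for name, seq in sites:
        expanded.append((name, "+", seq))
        expanded.append((name, "-", reverse_complement(seq)))
    return expanded


# Hypothetical sample sites, not taken from the subsequence file.
sites = [("site_a", "TATAAA"), ("site_b", "GGGCGG")]
for name, strand, seq in with_reverse_complements(sites):
    print(name, strand, seq)
```

Because every site is copied once, the list to be searched doubles in size, which is consistent with a large file needing noticeable time before the search itself starts.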

Ghosh, D. (1990)  A Relational Database of Transcription Factors.
	Nucleic Acids Res 18:  1749-1756.
	
2.  The protein patterns file contains all of the patterns from the PROSITE database release 9.0 (June 1992) that could be represented using MacVector's subsequence file format.  The references for these subsequences are in the text file named "patterns.refs", which can be opened with any word processor or text editor.  Each reference is labeled with the name used in the protein patterns file as well as the name used in PROSITE.

3.  The other subsequence files were provided to IBI in MacVector format by Dr. Todd Wolf of Harvard University.
